Apache Hadoop 3.3.5 – HDFS Short 您所在的位置:网站首页 hadoop go Apache Hadoop 3.3.5 – HDFS Short

Apache Hadoop 3.3.5 – HDFS Short

#Apache Hadoop 3.3.5 – HDFS Short| 来源: 网络整理| 查看: 265

HDFS Short-Circuit Local Reads Short-Circuit Local Reads Background Setup Example Configuration Legacy HDFS Short-Circuit Local Reads Short-Circuit Local Reads Background

In HDFS, reads normally go through the DataNode. Thus, when the client asks the DataNode to read a file, the DataNode reads that file off of the disk and sends the data to the client over a TCP socket. So-called “short-circuit” reads bypass the DataNode, allowing the client to read the file directly. Obviously, this is only possible in cases where the client is co-located with the data. Short-circuit reads provide a substantial performance boost to many applications.

Setup

To configure short-circuit local reads, you will need to enable libhadoop.so. See Native Libraries for details on enabling this library.

Short-circuit reads make use of a UNIX domain socket. This is a special path in the filesystem that allows the client and the DataNodes to communicate. You will need to set a path to this socket. The DataNode needs to be able to create this path. On the other hand, it should not be possible for any user except the HDFS user or root to create this path. For this reason, paths under /var/run or /var/lib are often used.

The client and the DataNode exchange information via a shared memory segment on /dev/shm.

Short-circuit local reads need to be configured on both the DataNode and the client.

Example Configuration

Here is an example configuration.

dfs.client.read.shortcircuit true dfs.domain.socket.path /var/lib/hadoop-hdfs/dn_socket Legacy HDFS Short-Circuit Local Reads

Legacy implementation of short-circuit local reads on which the clients directly open the HDFS block files is still available for platforms other than the Linux. Setting the value of dfs.client.use.legacy.blockreader.local in addition to dfs.client.read.shortcircuit to true enables this feature.

You also need to set the value of dfs.datanode.data.dir.perm to 750 instead of the default 700 and chmod/chown the directory tree under dfs.datanode.data.dir as readable to the client and the DataNode. You must take caution because this means that the client can read all of the block files bypassing HDFS permission.

Because Legacy short-circuit local reads is insecure, access to this feature is limited to the users listed in the value of dfs.block.local-path-access.user.

dfs.client.read.shortcircuit true dfs.client.use.legacy.blockreader.local true dfs.datanode.data.dir.perm 750 dfs.block.local-path-access.user foo,bar


【本文地址】

公司简介

联系我们

今日新闻

    推荐新闻

      专题文章
        CopyRight 2018-2019 实验室设备网 版权所有